ABSTRACT

We propose a method for testing cosmic homogeneity based on the Shannon entropy in information theory and test the potential and limitations of the method on Monte Carlo simulations of some homogeneous and inhomogeneous 3D point processes in a finite region of space. We analyze a set of N-body simulations to investigate the prospect of determining the scale of homogeneity with the proposed method and show that the method could serve as an efficient tool for the study of homogeneity.
1 INTRODUCTION

The cosmological principle, which assumes that the Universe is statistically homogeneous and isotropic on very large scales, is one of the fundamental pillars of modern cosmology. It cannot be proved in a mathematical sense and can only be verified from observations and from various predictions of the physical theories based on it. The cosmic microwave background is by far the most conclusive evidence in favour of isotropy (Penzias & Wilson 1965; Smoot et al. 1992; Fixsen et al. 1996), which also strongly supports large scale homogeneity in the early Universe. Various other observations like the isotropy in angular distributions of radio sources (Blake & Wall 2002) and the isotropy of the X-ray background (Peebles 1991; Wu et al. 1999; Scharf et al. 2000) support the assumption of cosmic homogeneity on large scales. Isotropy does not by itself guarantee homogeneity; it implies homogeneity only when there is isotropy around every point. The present Universe is known to be highly inhomogeneous on small scales and there are important consequences if the inhomogeneities persist on large scales. The most important implication of inhomogeneities comes from the averaging problem in General Relativity, through their effect on the large scale dynamics known as the backreaction mechanism. The backreaction mechanism can cause a global cosmic acceleration without any additional dark energy component (Buchert & Ehlers 1997; Schwarz 2002; Kolb et al. 2006; Buchert 2008), although it seems unlikely that this can explain all of it (Paranjape & Singh 2008). The implications of inhomogeneities and backreaction for cosmology are still considered to be important even if they do not provide an alternate explanation of dark energy (Paranjape 2009; Kolb et al. 2010; Ellis 2011).

The principle of cosmic homogeneity demands that the statistical properties of the observed galaxy distribution in a given finite volume do not depend on the location of that volume in the Universe. The statistical properties of galaxy distributions are characterized by the correlation functions (Peebles 1980). The two point correlation function on small scales, $0.1\,h^{-1}$ Mpc $< r < 10\,h^{-1}$ Mpc, is well described by a power law of the form $\xi(r) = (r/r_0)^{-\gamma}$, with correlation length $r_0 \sim 5\,h^{-1}$ Mpc and slope $\gamma \sim 1.8$. $\xi(r)$ vanishes at scales $> 20\,h^{-1}$ Mpc, which is consistent with large scale homogeneity. However the problem with correlation function analysis is that it assumes a mean density on the scale of the survey, which is not a defined quantity below the scale of homogeneity. Most of the statistical tools for homogeneity analysis are based on the simple number counts $n(<r)$ in spheres of radius r, which are expected to scale as $\sim r^3$ for a homogeneous distribution. The conditional density (Hogg et al. 2005; Sylos Labini 2011a) measures the average density in these spheres, which is expected to flatten out beyond the scale of homogeneity. The fractal analysis (Martinez & Jones 1990; Coleman & Pietronero 1992; Borgani 1995) uses the scaling of different moments of $n(<r)$ to characterize the scale of homogeneity. Some of the studies carried out with these methods on different galaxy surveys claim to have found a transition to homogeneity on sufficiently large scales, $70-150\,h^{-1}$ Mpc (Martinez & Coles 1994; Guzzo 1997; Martinez et al. 1998; Bharadwaj et al. 1999; Pan & Coles 2000; Kurokawa et al. 2001; Hogg et al. 2005; Yadav et al. 2005; Sarkar et al. 2009; Scrimgeour et al. 2012), whereas some studies claim the absence of any such transition out to the scale of the survey (Coleman & Pietronero 1992; Amendola & Palladino 1999; Sylos Labini et al. 2007, 2009a,b; Sylos Labini 2011b). The disagreements between various studies indicate the need for some alternative measures of homogeneity which could capture interesting information on different aspects of a homogeneous distribution and serve as complementary and alternative tools to the existing methods in the literature. In the present work we introduce a method to assess homogeneity which is based on evenness, a more general and robust aspect of any homogeneous distribution. We employ the Shannon entropy (Shannon 1948) to measure the unevenness characterizing inhomogeneities in a distribution and explore the possibility of using the proposed method to determine the scale of homogeneity.

A brief outline of the paper follows. We describe our method and various effects in Section 2, describe the tests and the data in Section 3 and present the results and conclusions in Section 4.



2 METHOD OF ANALYSIS 

Our method is based on the Shannon entropy in Information 
theory originally proposed by Claude Shannon to quantify 
the information content in strings of text. 

In information theory, entropy is a measure of the amount of information required to describe a random variable. The Shannon entropy for a discrete random variable X with n outcomes $\{x_i : i = 1, ..., n\}$ is a measure of uncertainty denoted by H(X) and defined as

$$ H(X) = - \sum_{i=1}^{n} p(x_i) \log p(x_i) \tag{1} $$

where p(x) is the probability distribution of the random variable X. An increase in Shannon entropy increases the uncertainty and decreases the information available about the random variable. Another interesting aspect of the Shannon entropy is that it is maximized by a uniform distribution.
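As a minimal illustration of equation 1 and of the maximum reached for a uniform distribution, the following sketch evaluates the entropy of two toy distributions. The function name `shannon_entropy` and the example probabilities are illustrative assumptions and are not part of the analysis in this paper.

```python
import math

def shannon_entropy(probs, base=10.0):
    """Return H = -sum p log p for a discrete probability distribution.

    Outcomes with zero probability contribute nothing to the sum.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p, base) for p in probs if p > 0.0)

# Toy distributions over four outcomes (hypothetical values).
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

h_uniform = shannon_entropy(uniform)   # equals log10(4), the maximum for n = 4
h_skewed = shannon_entropy(skewed)     # strictly smaller than h_uniform
```

Any departure from equal probabilities lowers the entropy below log n, which is the property exploited in the rest of this section.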

We propose a method to study inhomogeneities in a 3D distribution of points. Given a set of N points distributed in 3D, we consider each point i as a center and determine $n_i(<r)$, the number of other points within a sphere of radius r, as

$$ n_i(<r) = \sum_{j=1,\, j \neq i}^{N} \Theta(r - |\mathbf{x}_i - \mathbf{x}_j|) \tag{2} $$

where $\Theta$ is the Heaviside step function and $\mathbf{x}_i$ and $\mathbf{x}_j$ are the radius vectors of the i-th and j-th points respectively. To avoid any edge effects we discard as centers all the points which lie within a distance r from the survey boundary. Clearly the number of valid centers will decrease with increasing r for any finite volume sample. We define a separate random variable $X_r$ for each radius r which has M(r) possible outcomes, each given by $f_{i,r} = \rho_{i,r} / \sum_{i=1}^{M(r)} \rho_{i,r}$, with the constraint $\sum_{i=1}^{M(r)} f_{i,r} = 1$. Here $\rho_{i,r} = n_i(<r) / (\frac{4}{3} \pi r^3)$ is the density at the i-th center. Note that for a given sample with a finite volume, M(r) is the maximum number of valid centers available at a radius r, i.e. there is no room to place a further sphere of radius r within the given volume.

The Shannon entropy associated with the random variable $X_r$ can be written as

$$ H_r = - \sum_{i=1}^{M(r)} f_{i,r} \log f_{i,r} = \log \left( \sum_{i=1}^{M(r)} n_i(<r) \right) - \frac{\sum_{i=1}^{M(r)} n_i(<r) \log n_i(<r)}{\sum_{i=1}^{M(r)} n_i(<r)} \tag{3} $$

where the base of the logarithm is arbitrary and we choose it to be 10. Note that in $X_r$, $f_r$ and $H_r$, r is just a label for a number and not an argument.
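The two forms of equation 3 can be checked numerically against each other. The sketch below assumes the counts $n_i(<r)$ are already available as a list; the function names and the sample counts are illustrative only.

```python
import math

def entropy_from_fractions(counts):
    """H_r = -sum f log10 f with f_i = n_i / sum(n)."""
    total = sum(counts)
    return -sum((n / total) * math.log10(n / total) for n in counts if n > 0)

def entropy_from_counts(counts):
    """Second form of equation 3: log10(sum n) - sum(n log10 n) / sum(n)."""
    total = sum(counts)
    weighted = sum(n * math.log10(n) for n in counts if n > 0)
    return math.log10(total) - weighted / total

sample_counts = [3, 5, 2, 10]          # hypothetical n_i(<r) values
h_first = entropy_from_fractions(sample_counts)
h_second = entropy_from_counts(sample_counts)
```

The second form works directly on the integer counts, so the sphere volume never has to be evaluated; it cancels in $f_{i,r}$ because all spheres at a given r have the same volume.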

In an ideal situation when all the spheres around the M(r) valid centers are equally populated, one gets a uniform value of $f_{i,r} = 1/M(r)$ for all the centers, maximizing the uncertainty. The Shannon entropy $H_r$ (equation 3) then has its maximum value $(H_r)_{max} = \log M(r)$ for radius r. The relative Shannon entropy, $H_r/(H_r)_{max}$, at any r quantifies the degree of uncertainty in the knowledge of the random variable $X_r$. When $H_r/(H_r)_{max} = 1$ the knowledge about the random variable $X_r$ becomes most uncertain. The distribution of $f_r$ also becomes completely uniform when $H_r/(H_r)_{max} = 1$ is reached. Equivalently one can use $1 - H_r/(H_r)_{max}$ to quantify the information available in $X_r$ at any r.

The joint Shannon entropy of a set of independent random variables is simply the sum of the individual entropies associated with each of the random variables. If they are dependent, the total entropy is instead a sum of conditional entropies. Entropy measures the uncertainty in a random variable, whereas information is a difference in uncertainty, that is, a difference in entropies. The mutual information characterizes the reduction in the uncertainty of one random variable due to the knowledge of the other.
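For reference, the standard identities behind these statements can be written for two generic random variables X and Y (illustrative symbols, not the $X_r$ defined in this section) as:

```latex
\begin{align}
H(X,Y) &= H(X) + H(Y \mid X), \\
I(X;Y) &= H(X) - H(X \mid Y) = H(X) + H(Y) - H(X,Y) \geq 0 .
\end{align}
```

The mutual information vanishes only when X and Y are independent, in which case the joint entropy reduces to the plain sum $H(X) + H(Y)$.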

In our case the mutual information between the random variables $X_r$ is always positive, as the random variables are not independent: the density measurements around the M(r) centers at each radius r are taken from the same finite volume sample. Given that the mutual information is always positive for correlated random variables, an increase in information in $X_r$ at one r would lead to a decrease in information in the $X_r$ at other r values. Apart from the correlations from the finite sample, the random variables could have extra correlations if the points are clustered or if the points are distributed in a preferred way. These extra correlations increase the mutual information in the random variables $X_r$. The decrease of information in $X_r$ with increasing r is expected irrespective of whether the distribution is homogeneous or inhomogeneous, but it would diminish differently depending on the nature and the degree of inhomogeneity present in the distribution. Maximum uncertainty, or the complete loss of information in the knowledge of the random variable $X_r$ at a radius r, suggests that beyond r the random variables $X_r$ would be independent and completely uninformative about each other. In an infinite, perfectly homogeneous system, if the spheres used for density measurements are completely independent then the random variables $X_r$ at each r have no knowledge about each other, making them maximally uncertain and devoid of any information. When no inhomogeneities are present due to other sources, the departure of $H_r/(H_r)_{max}$ from 1 would be solely due to the correlations among $X_r$ caused by the finite volume of the sample. When other sources are present one would expect a larger departure of this ratio from 1 and a more uneven distribution of $f_r$. The scale where this ratio levels off at 1 indicates the absence of any correlations among $X_r$ beyond that scale. But this ideal situation may never occur in an exact sense in a finite sample, as the correlations between the $X_r$ introduced by the finite volume and the confinement bias persist over the whole range of length scales. This would always give some residual information in $X_r$ even at the largest value of r. But the mutual information content due to clustering or any other source of inhomogeneity is expected to diminish with increasing length scale, provided a scale of homogeneity exists for the distribution. As the exact transition marked by $H_r/(H_r)_{max} = 1$ may never happen in a finite volume sample, one could set a very small limiting value for $1 - H_r/(H_r)_{max}$ to identify the scale of homogeneity. In our analysis we set this limit to $10^{-4}$.
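The threshold criterion can be sketched as below. The helper names, the toy counts and the choice of reporting the first radius at which $1 - H_r/(H_r)_{max}$ falls below the limit are assumptions made for illustration; the text above only fixes the limiting value of $10^{-4}$.

```python
import math

def relative_entropy(counts):
    """Return H_r / (H_r)_max for the counts n_i(<r) at one radius."""
    total = sum(counts)
    h = -sum((n / total) * math.log10(n / total) for n in counts if n > 0)
    return h / math.log10(len(counts))

def homogeneity_scale(radii, counts_per_radius, limit=1e-4):
    """Return the first radius where 1 - H_r/(H_r)_max < limit, else None."""
    for r, counts in zip(radii, counts_per_radius):
        if len(counts) > 1 and 1.0 - relative_entropy(counts) < limit:
            return r
    return None

# Hypothetical counts at three radii: the distribution of n_i becomes more even.
radii = [5, 10, 15]
counts_per_radius = [
    [1, 5, 9, 2],
    [40, 44, 38, 42],
    [100, 100, 101, 100],
]
scale = homogeneity_scale(radii, counts_per_radius)
```

As discussed in Section 2.2, a crossing found this way still has to be compared with the scale at which confinement bias takes over before it is read as a physical transition.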



2.1 Effects of clustering, intrinsic inhomogeneity 
and finite volume 

Clustering of the points is the most important source of correlations between the random variables $X_r$. Clustering produces fluctuations in $f_r$ which are directly related to the fluctuations in $n(<r)$. When the points are clustered we expect on average $\bar{n}(<r)$ points in a volume V, where $\bar{n}(<r)$ is given by

$$ \bar{n}(<r) = \lambda \int_V (1 + \xi(x))\, d^3x = \lambda V + 4 \pi \lambda \int_0^r x^2 \xi(x)\, dx \tag{4} $$

Here $\lambda$ is the mean density of the distribution and $V = \frac{4}{3} \pi r^3$ is the volume of the sphere used. The first term has fluctuations from the Poisson noise of the same order. The Poisson noise rapidly decreases with increasing r and is only important on small scales. If the distribution itself is intrinsically inhomogeneous then the Poisson noise will be modulated according to the spatially dependent intensity parameter $\lambda(x)$. The second term takes into account the clustering of the points. The variance of $n(<r)$ is

$$ \sigma^2_{n(<r)} = \overline{n^2(<r)} - \left( \bar{n}(<r) \right)^2 \tag{5} $$

Going back to our definition of $f_{i,r} = \rho_{i,r} / \sum_{i=1}^{M(r)} \rho_{i,r}$, the fluctuations in this quantity are closely related to the fluctuations in $n_i(<r)$. The variance in $f_r$ is

$$ \sigma^2_{f_r} = \frac{\sigma^2_{n(<r)}}{\left[ M(r)\, \bar{n}(<r) \right]^2} \tag{6} $$

which is $1/M^2(r)$ times the normalized variance of $n(<r)$.
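The step from equation 5 to equation 6 can be spelled out as follows, assuming that the sum of the counts over the M(r) centers is treated as a fixed normalization when the variance is taken (an assumption made here for illustration):

```latex
\begin{align}
f_{i,r} &= \frac{n_i(<r)}{\sum_{j=1}^{M(r)} n_j(<r)} = \frac{n_i(<r)}{M(r)\, \bar{n}(<r)}, \\
\sigma^2_{f_r} &= \frac{\sigma^2_{n(<r)}}{\left[ M(r)\, \bar{n}(<r) \right]^2}
              = \frac{1}{M^2(r)} \, \frac{\sigma^2_{n(<r)}}{\bar{n}^2(<r)} .
\end{align}
```

The sphere volume cancels between numerator and denominator of $f_{i,r}$, so the expression in terms of counts is identical to the one in terms of densities.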
One can quantify the correlations of any two random variables $X_r$ at two different r by estimating their covariance. The correlation coefficient is

$$ C_{X_{r_i} X_{r_j}} = \frac{\mathrm{Cov}(f_{r_i}, f_{r_j})}{\sigma_{f_{r_i}} \sigma_{f_{r_j}}} \tag{7} $$

where the indices i and j take values between 1 and n. Here n is the total number of r values used in the analysis. In this case we have a positive correlation $0 < C_{X_{r_i} X_{r_j}} < 1$ between $X_{r_i}$ and $X_{r_j}$, and the correlations would be higher when the differences between $r_i$ and $r_j$ are smaller. The random vector $X = (X_{r_1}, X_{r_2}, X_{r_3}, ..., X_{r_n})$ has $M(r_1) M(r_2) M(r_3) ... M(r_n)$ equally likely outcomes, and in principle one can estimate the full covariance matrix of all the random variables $X_r$ at different r to compute their correlations.

Even in the absence of any clustering and inhomogeneity one would expect positive correlations between the random variables, as their probability distributions are derived from the same finite volume sample. The set of centers at each r is a subset of the centers at the preceding r. More specifically, the set of centers at every r is a subset of the set of centers at the smallest r, implying that the spheres at different r share some common regions. The fraction shared is larger when the difference between the r values is smaller, thereby introducing larger correlations between the $X_r$ from neighbouring r values. So the finite volume introduces correlations between the random variables $X_r$ at different r, which increases the mutual information in the random variables.



2.2 Effects of overlapping 

One should also keep in mind that the spheres used for density measurements are not independent and could share large overlapping regions. At each r we use a finite set of spheres $S_r = \{s_{1,r}, s_{2,r}, s_{3,r}, ..., s_{M(r),r}\}$ for density measurements. In general these spheres overlap with each other. The probability that a random point drawn out of the distribution would lie in a particular sphere $s_{i,r}$ is $n_i(<r)/N$, where N is the total number of points in the distribution. The probability that the randomly drawn point would appear somewhere in the sample is 1. If the spheres are disjoint then at any r there are M(r) + 1 outcomes of this experiment, i.e. the point would lie either in any one of the M(r) spheres or somewhere in the sample outside the spheres. But given the fact that the spheres overlap, the point could also appear at the intersections of multiple spheres. Given a finite sample A the total probability can be written as

$$
\begin{aligned}
P(A) &= P\left( \cup_{i=1}^{M(r)} s_{i,r} \right) + P\left( (\cup_{i=1}^{M(r)} s_{i,r})^c \right) \\
&= \sum_{i=1}^{M(r)} P(s_{i,r}) - \sum_{i<j}^{M(r)} P(s_{i,r} \cap s_{j,r}) + \sum_{i<j<k}^{M(r)} P(s_{i,r} \cap s_{j,r} \cap s_{k,r}) - \cdots \\
&\quad + (-1)^{M(r)-1} P\left( \cap_{i=1}^{M(r)} s_{i,r} \right) + P\left( (\cup_{i=1}^{M(r)} s_{i,r})^c \right) \\
&= \sum_{i=1}^{M(r)} P(s_{i,r}) - P_{overlap} + P\left( (\cup_{i=1}^{M(r)} s_{i,r})^c \right) \\
&= 1
\end{aligned}
\tag{8}
$$
The first sum in the above equation gives the sum of probabilities for individual spheres, which would be the full answer if they were disjoint. As the spheres overlap, one needs to take into account the probabilities for all the possible intersections. The second sum is over all distinct pairs of spheres, the third sum is over all distinct triples of spheres and so forth. The last term, which gives the probability that the point would come up in the region $(s_{1,r} \cup s_{2,r} \cup s_{3,r} ... \cup s_{M(r),r})^c$ outside the union of spheres, is important only at small r and becomes insignificant afterwards. It may be noted that in an overlapping scenario the different terms in equation 8 can be much larger than 1, but when summed together they will always give 1. In equation 8, $P_{overlap}$ contains all the terms from overlapping spheres, and successive terms in $P_{overlap}$ have alternating signs. At smaller r the dominant contribution to $P_{overlap}$ comes from all pairs of overlapping spheres. The contribution from all triples of overlapping spheres would be the next dominant term and so on. Thus at smaller r all the individual sums in $P_{overlap}$ are different and the magnitude of each sum is smaller than that of its preceding sum, resulting in a positive value of $P_{overlap}$. New sums appear in $P_{overlap}$ with increasing r, thereby increasing its value. But at the same time the differences between the successive sums start decreasing due to larger overlap, thereby decreasing the value of $P_{overlap}$. There is a competition between these two effects and initially the first effect is more dominant than the second, leading to an overall increase in $P_{overlap}$ with increasing r. As we increase the radius r the number of valid centers M(r) decreases and they preferentially get more confined near the center of the sample. Due to the finite volume of the sample the second effect will ultimately dominate on some scale, depending on the size of the finite sample and also to some extent on the nature of the inhomogeneity. For example the confinement bias is expected to be even higher for an inhomogeneous distribution which preferentially has more particles residing near the center of the sample. When the differences between the successive sums become smaller there would be a larger net cancellation, leading to smaller values of $P_{overlap}$. As the second effect starts dominating the first one, the value of $P_{overlap}$ would start decreasing, and finally on the scale of the largest sphere that would fit inside the finite sample all the individual terms in all the individual sums in $P_{overlap}$ would be of the same order, i.e. $P(s_i) \sim P(s_i \cap s_j) \sim P(s_i \cap s_j \cap s_k) \sim ... \sim P(s_1 \cap s_2 \cap s_3 ... \cap s_{M(r)}) \sim 1$. This would lead to a very large mutual cancellation of all the sums in $P_{overlap}$, decreasing its value to $\sum_{i=1}^{M(r)} P(s_{i,r}) - 1$, i.e. $M(r) - 1$. This finite volume effect eventually introduces an artificial evenness in the distribution of $f_r$ at some r depending on the sample size. We see that the effect of overlapping starts dominating much before the scale of the largest possible sphere (Figure 1), making an interpretation difficult on large scales. We note that all the other methods of testing homogeneity based on counts in spheres $n(<r)$ are also affected by the same problem, making the interpretations on large scales equally difficult.
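The bookkeeping in equation 8 can be illustrated on a very small example in which each sphere is represented by the set of point indices it contains. This is a toy sketch: the function name `overlap_terms` and the three sets below are hypothetical, and enumerating all intersections is only feasible for a handful of spheres, not for the M(r) used in the actual analysis.

```python
from itertools import combinations

def overlap_terms(spheres, n_total):
    """Return (sum of P(s_i), P_overlap, P(union)) for sets of point indices.

    P_overlap collects the pair, triple, ... intersection sums of the
    inclusion-exclusion expansion with signs +, -, +, ... so that
    P(union) = sum of P(s_i) - P_overlap.
    """
    m = len(spheres)
    sum_single = sum(len(s) for s in spheres) / n_total
    p_union = len(set().union(*spheres)) / n_total
    p_overlap = 0.0
    for k in range(2, m + 1):
        sign = (-1) ** k  # pairs enter with +, triples with -, and so on
        term = sum(len(set.intersection(*combo))
                   for combo in combinations(spheres, k)) / n_total
        p_overlap += sign * term
    return sum_single, p_overlap, p_union

# Ten points labelled 0..9 and three overlapping "spheres" (hypothetical).
toy_spheres = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 6}]
sum_single, p_overlap, p_union = overlap_terms(toy_spheres, n_total=10)
p_outside = 1.0 - p_union  # probability of landing outside every sphere
```

Here the individual probabilities already add up to more than 1, and it is the positive $P_{overlap}$ that brings the total back to 1 once the probability of lying outside all spheres is included.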

In order to avoid the complications of the correlations introduced by the overlapping of the spheres we also consider non-overlapping spheres of different radii. But the statistics become too noisy due to the very small number of independent spheres at progressively larger radii, which prevents us from addressing the issue of homogeneity on large scales.

A point of caution in this context is that in our method $H_r/(H_r)_{max} \sim 1$ at very large length scales does not necessarily indicate a real transition to homogeneity, since it is an inevitable outcome forced by the confinement bias resulting from the finite volume of the sample. But a large enough sample, which allows spheres up to a sufficiently large r without significant confinement of the centers, could detect the scale of homogeneity if the transition happens before the scale where the confinement bias completely dominates the statistics.



3 TESTS ON HOMOGENEOUS, 

INHOMOGENEOUS AND CLUSTERED 
DISTRIBUTIONS 

In order to study the prospects and limitations of the proposed method we carry out some preliminary tests by applying it to some simple distributions. We consider (i) a homogeneous distribution without any clustering, (ii) inhomogeneous distributions without any clustering and (iii) strongly clustered distributions.

Of these three cases, the unevenness in type (i) is only due to Poisson noise. The unevenness in a type (ii) distribution is controlled by Poisson noise too, but it has the additional complexity that the contribution to the unevenness is governed by the distribution of the spatially dependent intensity parameter $\lambda(x)$. In a type (iii) distribution both Poisson noise and clustering together contribute to the unevenness.

For the first two types we generate a set of Monte Carlo realizations of some simple homogeneous and inhomogeneous point processes. For the third type we use a set of N-body simulations where the points are strongly clustered.



3.1 Monte Carlo simulations of homogeneous and 
inhomogeneous point processes 

We generate a set of Monte Carlo realizations of different types of homogeneous and inhomogeneous distributions.

For the sake of simplicity we consider some simple radial density distributions $\rho(r) = K \lambda(r)$ where (i) $\lambda(r) = 1/r$, (ii) $\lambda(r) = 1/r^2$ and (iii) $\lambda(r) = 1$. K is a normalization constant. The distributions in (i) and (ii) are inhomogeneous Poisson distributions which are isotropic about only one point, and the inhomogeneities in these distributions persist on all scales. The distribution in (iii) is a homogeneous Poisson point process which has a constant density everywhere.

Enforcing the desired number of particles N within radius R, one can turn the radial density function into a probability function within r = 0 to r = R which is normalized to one when integrated over that interval. So the probability of finding a particle at a given radius r is

$$ P(r) = \frac{r^2 \lambda(r)}{\int_0^R r^2 \lambda(r)\, dr} $$

which is proportional to the density at that radius, implying more particles in high density regions.

We generate the Monte Carlo realizations of these distributions using a Monte Carlo dartboard technique. The maximum of the function $r^2 \lambda(r)$ in P(r) is at r = R for $\lambda(r) = 1/r$ and $\lambda(r) = 1$, whereas for $\lambda(r) = 1/r^2$ the function is constant everywhere. We label the maximum value of P(r) as $P_{max}$. We randomly choose a radius r in the range $0 < r < R$ and a probability value is randomly chosen in the range $0 < P(r) < P_{max}$. The actual probability of finding a particle at the selected radius is then calculated using the expression for P(r) and compared to the randomly selected probability value. If the random probability is less than the calculated value, the radius is accepted and assigned isotropically selected angular coordinates $\theta$ and $\phi$; otherwise the radius is discarded. In this way, radii at which a particle is more likely to be found will be selected more often, because the random probability will more frequently be less than the calculated actual probability. We choose R = 200 in units of $h^{-1}$ Mpc and $N = 10^5$. We generate 10 realizations of each of the above density distributions and analyze them separately with the method described in Section 2.
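The dartboard (rejection sampling) procedure described above can be sketched as follows. The sketch works with the unnormalized weight $w(r) = r^2 \lambda(r)$, which is equivalent to comparing against P(r) and $P_{max}$ because the normalization cancels in the accept/reject test. The names `PROFILES`, `sample_points` and `mean_radius`, the random seed and the reduced number of points are assumptions made for illustration only.

```python
import math
import random

# Unnormalized radial weights w(r) = r^2 * lambda(r) and their maxima on [0, R].
PROFILES = {
    "1/r":   (lambda r: r,     lambda R: R),
    "1/r^2": (lambda r: 1.0,   lambda R: 1.0),
    "1":     (lambda r: r * r, lambda R: R * R),
}

def sample_points(n_points, R, profile, rng):
    """Draw n_points inside radius R with radial density proportional to lambda(r)."""
    weight, weight_max = PROFILES[profile]
    w_max = weight_max(R)
    points = []
    while len(points) < n_points:
        r = rng.uniform(0.0, R)          # trial radius
        trial = rng.uniform(0.0, w_max)  # trial value on the "dartboard"
        if trial < weight(r):            # accept with probability w(r) / w_max
            cos_theta = rng.uniform(-1.0, 1.0)   # isotropic direction
            sin_theta = math.sqrt(1.0 - cos_theta ** 2)
            phi = rng.uniform(0.0, 2.0 * math.pi)
            points.append((r * sin_theta * math.cos(phi),
                           r * sin_theta * math.sin(phi),
                           r * cos_theta))
    return points

def mean_radius(points):
    """Average distance of the points from the origin."""
    return sum(math.sqrt(x * x + y * y + z * z) for x, y, z in points) / len(points)

rng = random.Random(1)
homogeneous = sample_points(3000, 200.0, "1", rng)
inverse_r = sample_points(3000, 200.0, "1/r", rng)
inverse_r2 = sample_points(3000, 200.0, "1/r^2", rng)
```

A quick sanity check on such realizations is the mean radius, which should be close to 3R/4, 2R/3 and R/2 for $\lambda(r) = 1$, $1/r$ and $1/r^2$ respectively, since the radial probability densities are proportional to $r^2$, $r$ and a constant.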



3.2 N-body simulations 

We simulate the dark matter distribution using a Particle-Mesh (PM) N-body code. The simulations use $256^3$ particles on a $512^3$ mesh and cover a comoving volume of $[921.6\,h^{-1}\,\mathrm{Mpc}]^3$. We use $(\Omega_{m0}, \Omega_{\Lambda 0}, h) = (0.27, 0.73, 0.71)$ for the cosmological parameters along with a $\Lambda$CDM power spectrum with spectral index $n_s = 0.96$ and normalization $\sigma_8 = 0.812$ (Komatsu et al. 2009). A simple "sharp cutoff" biasing scheme (Cole, Hatton & Weinberg 1998) was used to extract particles from the simulations that are biased relative to the dark matter and are labelled as galaxies. The bias parameter b of each simulated biased sample was estimated using the ratio

$$ b = \sqrt{\frac{\xi_g(r)}{\xi_{dm}(r)}} \tag{9} $$

where $\xi_g(r)$ and $\xi_{dm}(r)$ are the galaxy and dark matter two-point correlation functions respectively. This ratio is found to be constant at length scales $r > 5\,h^{-1}$ Mpc and we use the average value over $5-40\,h^{-1}$ Mpc. We use this method to generate galaxy samples with bias values b = 1.5, 2 and 2.5. We run the simulation for three different realizations of the initial density fluctuations and extract biased samples from each of them. We randomly extract $N = 10^5$ particles from three non-overlapping spherical regions of radius $R = 200\,h^{-1}$ Mpc in each of the simulation boxes, giving us a total of nine samples for each of the bias values. The numbers N and R and the specific spherical geometry are chosen just to maintain uniformity across all the analyses presented here. In general one can choose any number of particles and any geometry for the samples.
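The bias estimate can be sketched as below, assuming the usual definition $b = \sqrt{\xi_g(r)/\xi_{dm}(r)}$, which is consistent with the statement in Section 4 that $\xi(r)$ in the biased samples is $b^2$ times larger than in the unbiased one. The function name `linear_bias`, the unweighted average over bins and the synthetic power-law correlation functions are illustrative assumptions, not measurements from the simulations.

```python
import math

def linear_bias(r_values, xi_gal, xi_dm, r_min=5.0, r_max=40.0):
    """Average sqrt(xi_gal / xi_dm) over the bins with r_min <= r <= r_max."""
    ratios = [
        math.sqrt(g / d)
        for r, g, d in zip(r_values, xi_gal, xi_dm)
        if r_min <= r <= r_max and d > 0.0 and g >= 0.0
    ]
    if not ratios:
        raise ValueError("no usable bins in the requested range")
    return sum(ratios) / len(ratios)

# Synthetic input: a power-law dark matter xi and a galaxy xi boosted by b^2 = 4.
r_values = [2.0, 5.0, 10.0, 20.0, 40.0, 60.0]
xi_dm = [(r / 5.0) ** -1.8 for r in r_values]
xi_gal = [4.0 * x for x in xi_dm]
b_estimate = linear_bias(r_values, xi_gal, xi_dm)
```

Only the bins between 5 and 40 $h^{-1}$ Mpc enter the average, mirroring the range quoted above over which the ratio is found to be constant.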



4 RESULTS AND CONCLUSIONS 

Figure 1. $P_{overlap}$ (equation 8) as a function of r ($h^{-1}$ Mpc) for different distributions: $\Lambda$CDM with b = 1, Poisson, $\lambda(r) = 1/r$ and $\lambda(r) = 1/r^2$. The error bars overplotted on the data points are not visible due to their very small sizes.

Figure 2. The ratio $H_r/(H_r)_{max}$ as a function of r for the homogeneous Poisson distribution and two different anisotropic distributions, $\lambda(r) = 1/r$ and $\lambda(r) = 1/r^2$. The tiny error bars overplotted on the data points are invisible here.

Figure 3. The ratio $H_r/(H_r)_{max}$ as a function of r ($h^{-1}$ Mpc) for the unbiased $\Lambda$CDM model and its different biased variants. 1-$\sigma$ error bars obtained from 9 different samples are overplotted at each point.

In Figure 2 we show the variation of $H_r/(H_r)_{max}$ with distance r for the homogeneous and inhomogeneous distributions described in subsection 3.1. For the Poisson distribution the value of $H_r/(H_r)_{max}$ shows a small departure from 1 at smaller values of r, which decreases with increasing r. The ratio reaches a value $\sim 1$ at $r \sim 15\,h^{-1}$ Mpc. As the points are uncorrelated, the void probability function (White 1979) for a homogeneous Poisson distribution is $e^{-\lambda V}$. It is expected that such a distribution would become homogeneous on a scale $r \sim \lambda^{-1/3}$, which is a measure of the average size of voids in the distribution. Since the correlations between the random variables $X_r$ in the homogeneous Poisson distribution mainly come from the finite volume effect, one would expect a smaller amount of mutual information in the set of random variables and consequently a faster information leakage, but the information in $X_r$, i.e. $1 - H_r/(H_r)_{max}$, would never be exactly zero due to the persistence of the finite volume correlations at all r.
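For the Monte Carlo samples of subsection 3.1 ($N = 10^5$ points inside $R = 200\,h^{-1}$ Mpc) the characteristic scale $\lambda^{-1/3}$ and the Poisson void probability can be evaluated directly. The function names below are illustrative assumptions.

```python
import math

def mean_density(n_points, radius):
    """Mean number density of n_points inside a sphere of the given radius."""
    return n_points / (4.0 / 3.0 * math.pi * radius ** 3)

def void_probability(lam, r):
    """Probability exp(-lam * V) that a sphere of radius r is empty (Poisson)."""
    return math.exp(-lam * 4.0 / 3.0 * math.pi * r ** 3)

lam = mean_density(1.0e5, 200.0)   # in (h^-1 Mpc)^-3
void_scale = lam ** (-1.0 / 3.0)   # in h^-1 Mpc
p_empty_15 = void_probability(lam, 15.0)
```

With these numbers $\lambda^{-1/3}$ comes out at roughly 7 $h^{-1}$ Mpc, the same order as the $r \sim 15\,h^{-1}$ Mpc at which the ratio reaches 1 for the Poisson samples, and a sphere of radius 15 $h^{-1}$ Mpc is essentially never empty.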

The nj~f — - values at smaller r values shows relatively 
larger departure from 1 for both the inhomogeneous distri- 



that increasing the radius r beyond the typical scale of the 
nonlinear structures in the biased distributions would lead to 
larger disparity in n(< r) and f r values and hence the larger 
inhomogeneities. The scales corresponding to the reversal in 
the behaviour of ratio corresponds to the typical scale of the 
nonlinear structures present in the biased distributions. Fur- 
ther on large scales the correlation functions £(r) in biased 
butions considered here. The ratio has a larger departure for distributions are b 2 times larger than the f (r) in unbiased 



the distribution with A(r) = \ than the distribution with 
A(r) = \ on all scales. This shows that the ratio nj^r — 
describes the degree of inhomogeneity present in two dif- 
ferent anisotropic distributions in correct order. Both these 
distributions are only isotropic about one point and have 
a radial density distribution which does not allow them to 

1 at 



(Hr-)r, 



be homogeneous on any scales. The ratio 
r ~ 120 for both of them simply shows the importance of 
confinement bias which ultimately is expected to take over 
the situation at some r depending on the sample size and 
type of inhomogeneity present. The confinement bias gets 
a larger boost in both of these distributions (Figure [TJ as 
there are already more particles preferentially located near 
the center of the samples. This effect produces larger over- 
lap and hence larger mutual information are stored in the 
random variables from neighbouring rs. Consequently this 
accelerates the leakage of information in X r with increas- 



ing r finally forcing the ratio 



Wrh, 



1. It may be noted 



that for the homogeneous Poisson distribution there are no 
such additional boost coming from preferential deposition 
and further there are no extra correlations due to cluster- 
ing. The transition p^y- — - ~ 1 is reached there at a r much 
before the scale where the confinement bias play a dominant 
role indicating a real transition to homogeneity. 

In the Figure [3] the variations of jj^y — with distance 
r are shown for ACDM N-body simulations with different 
linear bias values. The assumption of linear bias holds rea- 
sonably well on large scales. Different types of galaxies are 
biased differently with respect to the dark matter and the in- 
homogeneities in the distributions of different types of galax- 
ies could be modulated by their bias values. We would also 
like to investigate here how well the ratio 



(Hr)r. 



can track 



the variation in inhomogeneities in a distribution on differ- 
ent scales and give some useful information about the char- 
acteristics of the inhomogeneities present in them. We see 
that the ratio initially depart from 1 at small r for all the 
distributions with higher bias values showing systematically 
lesser inhomogeneity (lesser information in X r ) than lower 
bias values. In the unbiased ACDM model the centers used 
for calculating n(< r) are residing in all types of environ- 
ments (clusters, sheets, filaments and voids). The fact that 
the centers are distributed across different types of nonlin- 
ear structures increases the unevenness in the distribution 
of f r and the mutual information in X r . Whereas with in- 
creasing bias values the particles in a biased distribution 
preferentially represent progressively higher density peaks in 
the density field homogenizing the distribution on the corre- 
sponding length scales. This is primarily due to the fact that 
in a biased distribution the centers are located in less diverse 
environments. A complete reversal of this behaviour is seen 
at 15—20 h~ Mpc after which with increasing bias values the 
distributions systematically show larger inhomogeneity and 
more information in X r at all scales r until they eventually 



distributions introducing larger correlations among all the 
random variables X r at larger r. Eventually all the curves 



merge to 



(H r )r. 



= 1 in the range 100 - 150 /i -1 Mpc with 



the difference that the transition appears to happen at a 
relatively larger r values for larger bias. Although this can 
not be emphasized due to the size of the overlapping er- 
ror bars at those length scales. It may be noted that the 
inhomogeneities in the ΛCDM model with different biases
are much smaller than those in the inhomogeneous radial
density distributions given by $\lambda(r)$. The information in
$X_r$ for a Poisson distribution mostly comes from the finite
volume effect, whereas the very large mutual information in
the random variables in the two inhomogeneous distributions
is the combined outcome of finite volume correlations and a
large degree of confinement bias. In the unbiased and biased
ΛCDM models the mutual information is generated by
correlations from confinement bias and clustering together. It
may be noted that the confinement biases are much smaller
and very similar in the ΛCDM models and the homogeneous
Poisson distributions (Figure 1). The confinement biases
become very large in the two anisotropic distributions
considered here, which eventually forces the ratio
$H_r/(H_r)_{max}$ to 1 at some scales.
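The behaviour of the ratio described above can be illustrated with a minimal numerical sketch. It assumes the usual Shannon-entropy construction, which is not restated in this section: $f_i = n_i(<r)/\sum_j n_j(<r)$ for the $M$ usable centers, $H_r = -\sum_i f_i \log f_i$ and $(H_r)_{max} = \log M$. The cubic box, the toy clumped sample, the function name `entropy_ratio` and all parameter values are hypothetical choices made only for illustration; they are not the samples analysed in this work.

```python
import numpy as np


def entropy_ratio(points, box_size, r):
    """Return (H_r / (H_r)_max, M), where M is the number of usable centers."""
    # Use only centers whose sphere of radius r lies entirely inside the box.
    inside = np.all((points >= r) & (points <= box_size - r), axis=1)
    centers = points[inside]
    m = len(centers)
    if m < 2:
        return float("nan"), m
    # n_i(<r): number of other points within distance r of center i.
    diff = centers[:, None, :] - points[None, :, :]
    counts = (np.einsum("ijk,ijk->ij", diff, diff) < r * r).sum(axis=1) - 1
    total = counts.sum()
    if total == 0:
        return float("nan"), m
    f = counts[counts > 0] / total   # f_i = n_i / sum_j n_j (assumed definition)
    h_r = -np.sum(f * np.log(f))     # Shannon entropy H_r
    return h_r / np.log(m), m        # (H_r)_max = log M (assumed definition)


rng = np.random.default_rng(0)
box = 100.0  # hypothetical box side, arbitrary length units

# Homogeneous toy sample: 1000 uniform random points.
poisson = rng.uniform(0.0, box, size=(1000, 3))

# Inhomogeneous toy sample: 700 uniform points plus 3 Gaussian clumps.
clump_centers = rng.uniform(25.0, 75.0, size=(3, 3))
clumps = np.concatenate(
    [c + rng.normal(0.0, 3.0, size=(100, 3)) for c in clump_centers]
)
mixed = np.concatenate([rng.uniform(0.0, box, size=(700, 3)), clumps])

ratio_poisson, m_poisson = entropy_ratio(poisson, box, 15.0)
ratio_mixed, m_mixed = entropy_ratio(mixed, box, 15.0)
print("uniform sample:", ratio_poisson, "centers:", m_poisson)
print("clumped sample:", ratio_mixed, "centers:", m_mixed)
```

In this toy setup the clumped sample is expected to give a ratio further below 1 than the uniform one, because centers in clumps and centers in the background see very different counts. The number of usable centers is returned alongside the ratio because it shrinks quickly with $r$, which is where the confinement bias enters.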

If a scale of homogeneity exists then ideally one would
expect a naturally emerging evenness in $f_r$ when the spheres
of that radius around each center statistically include similar
numbers of different types of nonlinear structures. So, provided
that the correlations introduced by the finite volume
of the sample and the confinement bias are much smaller
than the correlations induced by clustering, the
leakage of information in $X_r$ would ideally track the variation
of inhomogeneity in the distribution with scale $r$.
Unfortunately, in a real situation one also has to deal with
the correlations due to finite volume and confinement bias
modulating the information in $X_r$ or the evenness of $f_r$.
We find that the confinement bias could start to dominate
well before the scale defined by the largest sphere
included in any finite sample (Figure 1). The degree of
correlation introduced among the $X_r$ would also depend on the
type of inhomogeneity present in the distribution, especially
if it is overpopulated near the center of the sample with
respect to the rest of the volume, resulting in larger overlap
between the spheres. When dealing with galaxy redshift
surveys one also has to properly take into account the different
selection effects involved, redshift space distortions
and the specific geometry of the samples. We plan to carry
out analysis on the publicly available catalogues from large
galaxy redshift surveys (e.g. 2dFGRS, Colless et al. 2001;
SDSS, Stoughton et al. 2003; 2MASS, Huchra et al. 2012)
in future works and investigate these issues further. It may
be noted here that a caveat of our method is that we assume
that we always have access to data samples on a spatial
hypersurface of constant time. Though this assumption is
approximately valid for low redshift galaxy samples
in the nearby Universe or data samples from a snapshot of
N-body simulations, it is not strictly true, as observational
samples do not consist of objects on a constant
time hypersurface but rather on a light cone. Taking this into
account, it is hard to distinguish radial inhomogeneity from
time evolution without assuming a cosmological model.
Consequently this could make an inhomogeneous but isotropic
model (e.g. anti-Copernican void models) look like a
homogeneous one on large scales. Fortunately these models can
be constrained with other observations such as SNe, CMB,
BAO and measurements of the Hubble parameter (Zibin et al.
2008; Clifton et al. 2008; Biswas et al. 2010; Clarkson 2012).
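The confinement bias discussed above can be made concrete with a small geometric sketch. It assumes, purely for illustration, a spherical sample of radius `sample_radius` in which a point may serve as a center at radius $r$ only if its sphere lies fully inside the sample; the function name and the numerical values are hypothetical and are not taken from the samples used in this work.

```python
def usable_center_fraction(r, sample_radius):
    """Fraction of a spherical sample's volume that can host centers at radius r."""
    if not 0.0 <= r <= sample_radius:
        raise ValueError("r must lie between 0 and the sample radius")
    # Usable centers must sit within (sample_radius - r) of the sample's middle.
    return (1.0 - r / sample_radius) ** 3


sample_radius = 200.0  # hypothetical sample radius in h^-1 Mpc
for r in (20.0, 50.0, 100.0, 150.0):
    print(r, usable_center_fraction(r, sample_radius))
```

Under this assumption, at half the sample radius only one eighth of the volume can still host centers, and every measuring sphere then contains the middle of the sample, so the spheres overlap heavily well before $r$ reaches the largest sphere that fits in the sample.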

Finally we note that the method presented here has
the desired ability to characterize inhomogeneities and their
variations on different length scales in any 3D distribution of
points. The method has the potential to successfully identify
any existing scale of homogeneity given a sufficiently large
volume which can ensure negligible confinement bias and
less effective overlap between the spheres up to large length
scales. Given the large survey volumes that are currently
available from modern redshift surveys, the method could
provide an efficient tool for exploring the issue of Cosmic
homogeneity.